What Do We Mean When We Speak About Named Entities?
Abstract
The concept of Named Entity (NE) has its origin in the Named Entity Recognition and Classification (NERC) tasks, an offshoot of Information Retrieval systems, and became one of the main points of interest in the Sixth and Seventh Message Understanding Conference (MUC-6, MUC-7) competitions, held in 1995 and 1998 (Chinchor, 1997; Black, 1998). Since then, most competitions and programs have included at least one task related to it. This is the case, for instance, of the Automatic Content Extraction (ACE) program, which began in 1999 and aims to develop technologies that automatically infer from human language the entities being mentioned, their relations, and the events in which they participate (Doddington et al., 2004). The concept of NE thus emerged in an environment of NLP applications and is far from being linguistically clear and settled. Although some disciplines, such as linguistics and language analysis, study particular items that share some characteristics with NEs, they do not provide the solid criteria needed to handle NERC tasks. On the one hand, the classic grammatical approach to Proper Noun (PN) analysis is insufficient to deal with the problems NERC poses. In one of the reference contributions on the matter in Spanish (Fernández Leborans, 1999), PNs are defined in their prototypical uses with regard to a series of common but not unique features, such as capitalization, lack of inflection, lack of lexical meaning, lack of determiner, lack of translation, monoreferentiality, or incompatibility with restrictive complementation. As can be seen, that approach considers only PNs in the strict linguistic sense, so it leaves out a vast number of cases (such as alphanumeric expressions or temporal expressions) that are relevant when talking about NEs.
This proposal has proven too narrow for use in NLP, especially for weak named entity detection and classification, since weak named entities usually do not even contain a PN. On the other hand, the philosophical approach to language and reference, via the concepts of Singular Terms and Definite Descriptions, is far too wide. To summarize an extensive literature in a few words (Fernández Moreno, 2006), Singular Terms include Proper Names, singular Indexical Terms and singular Definite Descriptions. Definite Descriptions are expressions introduced by a singular definite article which predicate a property possessed by a single individual. Singular Indexical Terms paradigmatically include both pronouns (personal and demonstrative) and some adverbs (place and time deictics). Finally, PNs are defined by two features: their reference does not depend on the context of utterance, and it is not dictated by the internal structure of the name itself. This approach deals with a number of linguistic phenomena which should not be part of a linguistically based approach to NE (at least in most cases). In fact, most Reference Theories treat Proper Names, most pronouns and any definite noun phrase as a definite expression, whereas NERC systems care only about a subgroup of
Publication date: 2007